Factored Neural Language Models
Authors
Abstract
Language models based on a continuous word representation and neural network probability estimation have recently emerged as an alternative to the established backoff language models. At the same time, factored language models have been developed that use additional word information (such as parts-of-speech, morphological classes, and syntactic features) in conjunction with refined backoff strategies. We present a new type of neural probabilistic language model that learns a mapping from both words and explicit word factors into a continuous space that is then used for word prediction. Additionally, we investigate several ways of deriving continuous word representations for unknown words from those of known words. The resulting model significantly reduces perplexity on sparse-data tasks when compared to standard backoff models, standard neural language models, and factored language models. Preliminary word recognition experiments show slight improvements of factored neural language models over all other models.
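The core idea of the abstract is to embed each word together with its explicit factors into one continuous space that feeds a next-word predictor. Below is a minimal sketch of that idea in PyTorch (the paper predates these tools); the choice of factors (POS tag and morphological class), all layer sizes, and the simple unknown-word fallback described in the comment are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class FactoredNLM(nn.Module):
    """Feed-forward LM whose input units are words plus their factors."""

    def __init__(self, vocab_size, pos_size, morph_size,
                 word_dim=64, factor_dim=16, hidden_dim=128, context=3):
        super().__init__()
        # One embedding table per input stream: surface word + two factors.
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.pos_emb = nn.Embedding(pos_size, factor_dim)
        self.morph_emb = nn.Embedding(morph_size, factor_dim)
        token_dim = word_dim + 2 * factor_dim
        self.hidden = nn.Linear(context * token_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, words, pos, morph):
        # words, pos, morph: (batch, context) index tensors.
        x = torch.cat([self.word_emb(words),
                       self.pos_emb(pos),
                       self.morph_emb(morph)], dim=-1)
        x = x.flatten(start_dim=1)          # concatenate the context window
        h = torch.tanh(self.hidden(x))
        return self.out(h)                  # logits over the next word

# For an unknown word only its factors are observed, so one simple strategy
# in the spirit of the abstract is to keep the factor embeddings and replace
# the word embedding, e.g. with a learned <unk> vector or with the mean
# embedding of known words that share the same factors.
```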
Related papers
Combination of Recurrent Neural Networks and Factored Language Models for Code-Switching Language Modeling
In this paper, we investigate the application of recurrent neural network language models (RNNLM) and factored language models (FLM) to the task of language modeling for Code-Switching speech. We present a way to integrate part-of-speech tags (POS) and language information (LID) into these models, which leads to significant improvements in terms of perplexity. Furthermore, a comparison between RN...
Factored Language Model based on Recurrent Neural Network
Among various neural network language models (NNLMs), recurrent neural network-based language models (RNNLMs) are very competitive in many cases. Most current RNNLMs use only a single feature stream, i.e., surface words. However, previous studies have shown that language models with additional linguistic information achieve better performance. In this study, we extend RNNLM by explicitly integrat...
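This snippet describes feeding additional linguistic features into an RNNLM alongside the surface word. One common way to do this is to concatenate a factor embedding to the word embedding at each time step before the recurrent layer, sketched below in PyTorch; the single generic factor stream and all sizes are assumptions for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn

class FactoredRNNLM(nn.Module):
    """RNNLM whose per-step input is a word embedding plus a factor embedding."""

    def __init__(self, vocab_size, factor_size, word_dim=64,
                 factor_dim=16, hidden_dim=128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.factor_emb = nn.Embedding(factor_size, factor_dim)
        self.rnn = nn.RNN(word_dim + factor_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, words, factors, state=None):
        # words, factors: (batch, seq_len) index tensors.
        x = torch.cat([self.word_emb(words),
                       self.factor_emb(factors)], dim=-1)
        h, state = self.rnn(x, state)
        return self.out(h), state           # per-step next-word logits
```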
Factored recurrent neural network language model in TED lecture transcription
In this study, we extend recurrent neural network-based language models (RNNLMs) by explicitly integrating morphological and syntactic factors (or features). We call the proposed model a factored RNNLM, which is expected to improve on standard RNNLMs. A number of experiments are carried out on top of a state-of-the-art LVCSR system, which show that the factored RNNLM improves the performance measured by perplexity ...
Combining recurrent neural networks and factored language models during decoding of Code-Switching speech
In this paper, we present our latest investigations of language modeling for Code-Switching. Since only little text material is available for Code-Switching speech, we integrate syntactic and semantic features into the language modeling process. In particular, we use part-of-speech tags, language identifiers, Brown word clusters and clusters of open class words. We develop factored langua...
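Combining an RNNLM with an FLM during decoding can be done in several ways; simple linear interpolation of the two models' probabilities is a common baseline, sketched below as a toy example. The interpolation weight and the two scoring callables are placeholders, not necessarily the combination scheme used in this paper.

```python
import math

def combined_logprob(word, history, p_rnnlm, p_flm, lam=0.5):
    """Log-probability of word given history under the linear interpolation
    lam * P_rnnlm + (1 - lam) * P_flm of two language models."""
    p = lam * p_rnnlm(word, history) + (1.0 - lam) * p_flm(word, history)
    return math.log(p)
```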
A lecture transcription system combining neural network acoustic and language models
This paper presents a new system for automatic transcription of lectures. The system combines a number of novel features, including deep neural network acoustic models using multi-level adaptive networks to incorporate out-of-domain information, and factored recurrent neural network language models. We demonstrate that the system achieves large improvements on the TED lecture transcription task...
Publication date: 2006